Rough Set Strategies to Data with Missing Attribute Values
Abstract
In this paper we assume that a data set is presented in the form of an incompletely specified decision table, i.e., some attribute values are missing. Our next basic assumption is that some of the missing attribute values are lost (e.g., erased) and some are "do not care" conditions (i.e., they were redundant or not necessary to make a decision or to classify a case). Incompletely specified decision tables are described by characteristic relations, which for completely specified decision tables reduce to the indiscernibility relation. It is shown how to compute characteristic relations using the idea of blocks of attribute-value pairs, used in some rule induction algorithms, such as LEM2. Moreover, the set of all characteristic relations for a class of congruent incompletely specified decision tables, defined in the paper, is a lattice. Three definitions of lower and upper approximations are introduced. Finally, it is shown that the presented approach to missing attribute values may be used for other kinds of missing attribute values than lost values and "do not care" conditions.
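The block-based computation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used in the rough set literature on incomplete data, which the abstract itself does not spell out: a case with a lost value ("?") is left out of every block of that attribute, a case with a "do not care" value ("*") is put into every block of that attribute, and the characteristic set K(x) is the intersection of the blocks for the specified values of x. The toy table, the function names, and the singleton-style approximations shown here are illustrative assumptions, not taken from the paper.

```python
LOST, DONT_CARE = "?", "*"

# Hypothetical toy decision table (condition attributes only): case -> values.
table = {
    1: {"Temperature": "high", "Headache": "?"},
    2: {"Temperature": "high", "Headache": "yes"},
    3: {"Temperature": "*", "Headache": "no"},
    4: {"Temperature": "normal", "Headache": "yes"},
}
attributes = ["Temperature", "Headache"]


def blocks(table, attribute):
    """Map each specified value v of `attribute` to the block [(attribute, v)]."""
    values = {row[attribute] for row in table.values()} - {LOST, DONT_CARE}
    result = {v: set() for v in values}
    for case, row in table.items():
        value = row[attribute]
        if value == DONT_CARE:
            # Assumed rule: a "do not care" case belongs to every block.
            for block in result.values():
                block.add(case)
        elif value != LOST:
            # Assumed rule: a lost value belongs to no block, so it is skipped.
            result[value].add(case)
    return result


def characteristic_sets(table, attributes):
    """Return K(x) for every case x as an intersection of blocks."""
    universe = set(table)
    attr_blocks = {a: blocks(table, a) for a in attributes}
    k = {}
    for case, row in table.items():
        current = set(universe)
        for a in attributes:
            # Missing values of either kind put no constraint on K(x).
            if row[a] not in (LOST, DONT_CARE):
                current &= attr_blocks[a][row[a]]
        k[case] = current
    return k


def singleton_approximations(k, concept):
    """One possible (singleton-style) pair of lower and upper approximations."""
    lower = {x for x, ks in k.items() if ks <= concept}
    upper = {x for x, ks in k.items() if ks & concept}
    return lower, upper


K = characteristic_sets(table, attributes)
lower, upper = singleton_approximations(K, {1, 2})
print(K)             # characteristic set of each case
print(lower, upper)  # approximations of the hypothetical concept {1, 2}
```

The characteristic relation is then the set of pairs (x, y) with y in K(x). When no value is missing, every K(x) is an ordinary indiscernibility class, which matches the abstract's remark that the characteristic relation reduces to the indiscernibility relation.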
Similar resources
Mining Incomplete Data with Many Missing Attribute Values: A Comparison of Probabilistic and Rough Set Approaches
In this paper, we study probabilistic and rough set approaches to missing attribute values. Probabilistic approaches are based on imputation: a missing attribute value is replaced either by the most probable known attribute value or by the most probable attribute value restricted to a concept. In this paper, in a rough set approach to missing attribute values we consider two interpretations of ...
A comparison of traditional and rough set approaches to missing attribute values in data mining
Real-life data sets are often incomplete, i.e., some attribute values are missing. In this paper we compare traditional, frequently used methods of handling missing attribute values, which are based on preprocessing, with another class of methods dealing with missing attribute values in which rule induction is performed directly on incomplete data sets, i.e., handling missing attribute values a...
A Comparative Study on Decision Rule Induction for incomplete data using Rough Set and Random Tree Approaches
Handling missing attribute values is one of the most challenging tasks in data analysis. Many approaches can be adopted to handle missing attributes. In this paper, a comparative analysis is made of an incomplete dataset for future prediction using a rough set approach and random tree generation in data mining. The result of simple classification technique (using random tree ...
Comparisons on Different Approaches to Assign Missing Attribute Values
A commonly used and naive solution to process data with missing attribute values is to ignore the instances which contain missing attribute values. This method may neglect important information within the data, a significant amount of data could easily be discarded, and the discovered knowledge may not contain significant rules. Some methods, such as assigning the most common values or assigning ...
Rough set approach to incomplete numerical data
Rough set theory has been successfully applied in various sectors, but the classical rough set model can only deal with complete data and with data sets in symbolic form (Jianhua Dai, 2013). Research adopting rough set theory has been conducted on numerical attributes and lost attribute values (Jerzy W. Grzymala-Busse and Zdzislaw S. Hippe, Nov. 2011). In this research discuss...
Publication date: 2006